In the last exercise, we made use of some regression models for our analysis. That was a good start, but publications need tables of results and fancy visualizations (e.g., for model predictions).
Again, we need to load the GESIS Panel COVID-19 survey data and also - just to make sure - re-run the regression models:
library(dplyr)
library(haven)
library(sjlabelled)
gp_covid <-
read_sav(
"../data/ZA5667_v1-1-0.sav"
) %>%
set_na(na = c(-1:-99, 97)) %>%
mutate(
likelihood_hospital = hzcy003a,
age_cat = as.factor(age_cat),
likelihood_hospital_cut =
ifelse(
likelihood_hospital > median(likelihood_hospital, na.rm = TRUE),
1,
0
)
) %>%
remove_all_labels()
serious_logistic_regression <-
glm(
likelihood_hospital_cut ~ age_cat + sex + political_orientation + marstat,
data = gp_covid,
family = binomial(link = "logit")
)
serious_cauchit_regression <-
glm(
likelihood_hospital_cut ~ age_cat + sex + political_orientation + marstat,
data = gp_covid,
family = binomial(link = "cauchit")
)
In the lecture, we started with the visualization of predictions. However, it is good practice to first build a regression table that shows, for example, the statistical significance of regression coefficients.
stargazer regression table of both models to compare them. Use the option type = "text" to have it printed to the console. Moreover, choose a table style for a journal of your choice.
?stargazer::`list of supported styles`.
You know what: reviewer 2 is “post-significant” (at least a little bit). He requests that you should please use confidence intervals at a level of .93 (which is now all the rage in his field).
??stargazer. Note that it is possible be that CIs are not shown because of the custom style you choose.
The previous exercise might also show that integrated packages like stargazer have limits. Because of these limits, standardized procedures of gathering results as offered by the broom package are quite attractive since packages like, e.g., huxtable ‘understand’ this standard format.
We’ll leave regression tables now and turn to plotting the results (as reviewer 2 wants you to do that after the third review round)…
age_cat for both models using the sjPlot package. Also, try to plot them side by side.
patchwork package for creating the combined plot.
Reviewer 2 is happy now. But Reviewer 1, who is more trusted by the editor, thinks estimating predictions based on ‘real’ average marginal effects (AME) is more cutting-edge. She also feels that you have proven that choosing a logit or a cauchit link doesn’t make a difference. Therefore she’d prefer to see only one plot for the cauchit model as this is not a methodological paper.
age_cat only for the cauchit model, but this time based on AME.
margins package and its cplot() function are your friends here.
Bonus: You may have seen that cplot() outputs the actual prediction data automatically. This was annoying in the slides, but you could use this output to create a plot with ggplot2. Note: When you set the argument draw = FALSE only the data are created.
age_cat only for the cauchit model based on AME using ggplot2.
xvals is stored as character. You should convert it to numeric.geom called geom_errorbar which you can use for the confidence intervals